Aim 1: Predict degree of improvement in ruminative and depressive symptoms: RRS & HDRS-6.
Aim 2: Determine which treatment-specific models predict across treatment arms.
Rumination Reflection Scale - Reflection dimension (RRSR)
| group | n | mean_age | sd_age | percent_male | hdrs6_baseline | hdrs6_percent_change | rrsr_baseline | rrsr_percent_change |
|---|---|---|---|---|---|---|---|---|
| e | 25 | 39.44000 | 13.72067 | 0.4400000 | 11.320000 | -0.2862191 | 10.92000 | -0.0512821 |
| k | 48 | 38.52083 | 10.64313 | 0.5208333 | 11.541667 | -0.5361011 | 10.89583 | -0.1491396 |
| s | 37 | 33.37838 | 10.77124 | 0.4324324 | 9.783784 | -0.3093923 | 12.24324 | -0.1346578 |
| comparison | diff | lwr | upr | p adj |
|---|---|---|---|---|
| k-e | -0.2104417 | -0.4127054 | -0.0081780 | 0.0394338 |
| s-e | -0.0188892 | -0.2312001 | 0.1934216 | 0.9756563 |
| s-k | 0.1915525 | 0.0121472 | 0.3709577 | 0.0334470 |
| comparison | diff | lwr | upr | p adj |
|---|---|---|---|---|
| k-e | -0.1398505 | -0.3340561 | 0.0543550 | 0.2055795 |
| s-e | -0.1515761 | -0.3554285 | 0.0522763 | 0.1855638 |
| s-k | -0.0117256 | -0.1839834 | 0.1605322 | 0.9856754 |
Global Connectivity
Under and overfitting
Cross validation
Result: Gradient boosted trees consistently outperformed RFs and SVMs. Lower |r| thresholds tend to perform better.
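The |r| thresholds mentioned above are read here as a univariate filter: keep only features whose absolute Pearson correlation with the outcome meets a cutoff. The sketch below works under that assumption; the function names, the toy data, and the cutoff of 0.5 are illustrative and not taken from the analysis code.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant input: treat as uncorrelated
    return sxy / sqrt(sxx * syy)

def filter_by_abs_r(features, outcome, threshold):
    """Return names of features whose |r| with the outcome is >= threshold."""
    return [name for name, values in features.items()
            if abs(pearson_r(values, outcome)) >= threshold]

# Toy data (hypothetical): three features measured on five subjects.
outcome = [-0.6, -0.4, -0.3, -0.1, 0.0]
features = {
    "feat_a": [1.0, 2.0, 3.0, 4.0, 5.0],    # rises with the outcome
    "feat_b": [5.0, 4.1, 3.0, 2.2, 1.0],    # falls as the outcome rises
    "feat_c": [1.0, -1.0, 1.0, -1.0, 1.0],  # only weakly related
}
selected = filter_by_abs_r(features, outcome, threshold=0.5)
```

To avoid leakage, a filter like this would normally be recomputed on the training portion of each cross-validation fold rather than on the full sample.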
Gradient Boosted Trees Illustration
Unlike RFs and SVMs, gradient boosted trees have many hyperparameters to tune, so their grid search spaces are extensive. Manual tuning is difficult, so I’ll compare its performance to that of adaptive grid searches.
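To show why the search space grows so quickly, here is a small sketch that expands a full grid. The parameter names follow XGBoost conventions, which is an assumption about the library used, and the candidate values are invented for illustration; they are not the grid from this analysis.

```python
from itertools import product

# Hypothetical grid: names follow XGBoost conventions, values are illustrative only.
grid = {
    "eta": [0.01, 0.05, 0.1, 0.3],
    "max_depth": [2, 4, 6],
    "min_child_weight": [1, 3, 5],
    "subsample": [0.5, 0.75, 1.0],
    "colsample_bytree": [0.5, 1.0],
}

def expand_grid(grid):
    """Return every combination of parameter values as a list of dicts."""
    names = list(grid)
    return [dict(zip(names, combo))
            for combo in product(*(grid[name] for name in names))]

candidates = expand_grid(grid)
n_candidates = len(candidates)  # 4 * 3 * 3 * 3 * 2 combinations
```

Each added parameter multiplies the number of candidates, and every candidate is refit in every cross-validation fold. That cost is what an adaptive search tries to reduce by discarding poor settings early.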
| outcome | FALSE | TRUE |
|---|---|---|
| rrsr | 2304 | 2304 |
| hdrs6 | 2304 | 2304 |
Results: The biggest gains for the adaptive approach are in the ECT arm. The adaptive method does poorly in the sleep deprivation arm.
Approach: Take models that performed decently within treatment (PvAc > 0.1) and use them to predict symptom change across arms. Do models performing better within-treatment perform better across treatment as well?
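The selection step above can be sketched as follows, assuming PvAc means the Pearson correlation between predicted and actual symptom change. The two "models" are hypothetical stand-in functions and the data are toy values. In the real analysis the within-arm score would come from cross-validation.

```python
from math import sqrt
from statistics import mean

def pvac(predicted, actual):
    """Pearson correlation between predicted and actual values (assumed meaning of PvAc)."""
    mp, ma = mean(predicted), mean(actual)
    spa = sum((p - mp) * (a - ma) for p, a in zip(predicted, actual))
    spp = sum((p - mp) ** 2 for p in predicted)
    saa = sum((a - ma) ** 2 for a in actual)
    return spa / sqrt(spp * saa) if spp > 0 and saa > 0 else 0.0

def evaluate(models, x, y):
    """Score every model on one arm's data."""
    return {name: pvac([f(v) for v in x], y) for name, f in models.items()}

# Hypothetical stand-ins for fitted models: each maps one feature to a prediction.
models = {"m1": lambda v: 2.0 * v, "m2": lambda v: -v}

# Toy arms: one feature value and one symptom change per subject.
ketamine_x, ketamine_y = [1.0, 2.0, 3.0, 4.0], [-0.1, -0.3, -0.4, -0.7]
ect_x, ect_y = [1.0, 2.0, 3.0, 4.0], [-0.2, -0.1, -0.5, -0.4]

within = evaluate(models, ketamine_x, ketamine_y)                # within-arm PvAc
kept = {n: f for n, f in models.items() if within[n] > 0.1}      # PvAc > 0.1 filter
across = evaluate(kept, ect_x, ect_y)                            # cross-arm PvAc
```

Plotting `within` against `across` for the kept models is one way to ask whether better within-treatment models also do better across treatments.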
Results: For both the RRSR and HDRS-6, higher-performing ketamine models tend to do better across treatment arms. Sleep deprivation models generally perform poorly within arm, but the good ones still generalize comparably. ECT models tend not to generalize. Overall, generalizability depends heavily on whether baseline symptoms are included. Some hyperparameters, namely eta and child weight, were also strong determinants (not shown).
Approach: Take the top 30 models for each treatment-by-outcome combination and compute the importance of each feature. Average the feature importance scores across the 30 models and compare the overlap of the top 10 features.
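A minimal sketch of the averaging and overlap steps, using toy importance scores and k = 2 instead of 10. Counting a feature that a model never used as zero importance is an assumption, and all names here are hypothetical.

```python
from collections import defaultdict

def mean_importance(model_importances):
    """Average each feature's importance over models; unused features count as 0."""
    totals = defaultdict(float)
    for importance in model_importances:
        for feature, score in importance.items():
            totals[feature] += score
    n_models = len(model_importances)
    return {feature: total / n_models for feature, total in totals.items()}

def top_k(importance, k):
    """Names of the k most important features (ties broken alphabetically)."""
    ranked = sorted(importance.items(), key=lambda kv: (-kv[1], kv[0]))
    return [feature for feature, _ in ranked[:k]]

def overlap(a, b):
    """Features that appear in both lists."""
    return sorted(set(a) & set(b))

# Toy importances (hypothetical): two models per treatment-by-outcome combination.
ketamine_models = [{"f1": 0.5, "f2": 0.3, "f3": 0.2}, {"f1": 0.4, "f3": 0.6}]
ect_models = [{"f3": 0.7, "f4": 0.3}, {"f4": 0.5, "f5": 0.5}]

ketamine_top = top_k(mean_importance(ketamine_models), k=2)
ect_top = top_k(mean_importance(ect_models), k=2)
shared = overlap(ketamine_top, ect_top)
```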
Results: There is little overlap in the top features picked by these models. Gradient boosted trees involve a lot of resampling/subsampling, so given the small number of observations and the roughly 16K features to select from, this isn’t too surprising. Model generalizability is more important.
Approach: Median split subjects by their RMSE across 10-repeated 10-fold cross validation folds. Compare demographic and imaging characteristics of subjects with high and low errors.
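The split above can be sketched as follows, assuming each subject receives one held-out prediction per repeat and that the per-subject RMSE is taken over those repeats. The subject IDs and values are toy data, with three repeats instead of ten.

```python
from math import sqrt
from statistics import median

def subject_rmse(predictions, actual):
    """RMSE of one subject's repeated held-out predictions against the observed value."""
    return sqrt(sum((p - actual) ** 2 for p in predictions) / len(predictions))

def median_split(errors):
    """Label each subject 'high' or 'low' error relative to the median RMSE."""
    cut = median(errors.values())
    return {sid: ("high" if err > cut else "low") for sid, err in errors.items()}

# Toy data (hypothetical): observed change and held-out predictions per subject.
actual = {"s1": -0.5, "s2": -0.2, "s3": 0.0, "s4": -0.8}
predictions = {
    "s1": [-0.5, -0.5, -0.5],
    "s2": [-0.3, -0.1, -0.2],
    "s3": [-0.4, -0.4, -0.4],
    "s4": [-0.2, -0.3, -0.1],
}

errors = {sid: subject_rmse(predictions[sid], actual[sid]) for sid in actual}
error_class = median_split(errors)
```

The resulting labels can then be used to group demographic, clinical, and imaging variables for comparison.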
Results: The model struggled with participants who had higher baseline symptoms and subsequently improved more, and with younger participants. Imaging measures were similarly distributed across high- and low-error participants. Structural imaging measures were more similar than functional measures, which were affected by a few outliers.
Results: The model did worse with participants who had fewer baseline symptoms (unlike the RRSR) and who were older with a longer symptom duration. Global connectivity measures were distributed more orthogonally between error categories.
Results: The model did worse with participants who improved more, but baseline symptoms didn’t make a difference. Age, symptom duration, and number of prior episodes differed somewhat between classes. Barring a few outliers in the functional data, imaging measures were similar.
Results: Similarly, the model did worse with participants who did better. Age remains different between classes. Global connectivity measures were more orthogonal in PCA space.
Results: Demographic and clinical measures differ strongly by error class. The functional imaging space is skewed by several outliers.
Results: Same as RRSR.
Maybe the models will do better overall if we only offer predictions for patients we are confident about. My naive approach was to say the model is more confident about a patient when that patient’s repeated predictions across cross validation are more similar, i.e., have a lower standard deviation.
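A minimal sketch of that confidence heuristic, using toy predictions. Splitting at the median standard deviation is an assumption made for illustration; the cutoff actually used is not stated above.

```python
from statistics import median, stdev

def prediction_sd(repeated_predictions):
    """Standard deviation of each subject's held-out predictions across repeats."""
    return {sid: stdev(preds) for sid, preds in repeated_predictions.items()}

def split_by_confidence(sds):
    """Subjects at or below the median SD are 'confident', the rest 'uncertain'."""
    cut = median(sds.values())
    confident = [sid for sid, sd in sds.items() if sd <= cut]
    uncertain = [sid for sid, sd in sds.items() if sd > cut]
    return confident, uncertain

# Toy data (hypothetical): three repeated predictions per subject.
repeated_predictions = {
    "s1": [-0.30, -0.31, -0.29],
    "s2": [-0.10, -0.50, -0.30],
    "s3": [-0.20, -0.20, -0.21],
    "s4": [0.00, -0.60, -0.20],
}

sds = prediction_sd(repeated_predictions)
confident, uncertain = split_by_confidence(sds)
```

PvAc would then be computed separately within each subset to check whether restricting predictions to the low-variance group helps.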
Result: PvAc seems to be higher when we only predict high-variance patients, which is the opposite of what the confidence heuristic assumed.